WP: Visual Predictive Checks in Bayesian Workflows

Part 2

Teemu Säilynoja

2024-03-12

Visual Predictive Checks

Gelman et al. (2020)

A: Continuous data
Most commonly KDE-based density plots.

B: Summary statistic
Another common visual PPC. Already more task-specific.

C: Discrete data
Fewer commonly used tools for effective visual checks.

D: Data split into groups
Use tools from A to C to assess predictions for subgroups of data.

In Bayesian Workflows

  • Predictive checks are present at many stages of Bayesian workflows

  • Early stages of model building can be very exploratory

  • Clear guidelines reduce ad-hoc decisions during the exploration and assessment

    \(\Rightarrow\) fewer mistakes

Our work

Aim: Provide structured recommendations on which visual predictive checks to use.

  1. Continuous data
    • Focus of my previous WP
    • Bounds and point-masses are common challenges
      • Solution: Goodness-of-fit testing
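
A minimal sketch of what goodness-of-fit testing of a KDE can look like, assuming a PIT-based check: if the KDE represents the data well, the PIT values of the observations under the KDE should look uniform. The data are simulated (a hypothetical bounded variable), and `ks.test` is used here only as a simple stand-in for a uniformity test; it is not necessarily the test used in this work.

```r
set.seed(42)

# Hypothetical bounded data: exponential draws have a hard lower bound at 0.
y <- rexp(500)

# Gaussian KDE with the default bandwidth.
kde <- density(y)
dx <- diff(kde$x)[1]

# Approximate the CDF of the KDE on its evaluation grid.
kde_cdf <- cumsum(kde$y) / sum(kde$y)

# PIT values: the KDE's CDF evaluated at each observation.
pit <- approx(kde$x, kde_cdf, xout = y)$y

# Mass that the KDE places below the bound, where no data can occur.
mass_below_bound <- sum(kde$y[kde$x < 0]) * dx

# Stand-in uniformity test on the PIT values.
ks <- ks.test(pit, "punif")
```

A small `ks$p.value` or a noticeable `mass_below_bound` would suggest that the KDE plot misrepresents the data near the bound.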

Our work

Aim: Provide structured recommendations on which visual predictive checks to use.

  2. Count models
    • When are KDE plots good enough?
    • What to use when KDE fails to represent the data?
  3. Binomial predictions
    • Non-parametric calibration assessment
  4. Categorical and ordinal predictions
    • Extend tools for binomial data
      1. one-vs-others probability
      2. cumulative probability
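
A small sketch of the two reductions, assuming the model provides an n x K matrix of predicted category probabilities. All names and the simulated values are hypothetical; the point is only that each reduction yields a probability and a 0/1 outcome that binomial calibration tools can consume.

```r
set.seed(6)
n <- 6
K <- 3

# Hypothetical predicted category probabilities (rows sum to 1).
raw <- matrix(rexp(n * K), nrow = n, ncol = K)
probs <- raw / rowSums(raw)

# Hypothetical observed categories, coded 1..K.
y <- sample(1:K, n, replace = TRUE)

k <- 2

# 1. One-vs-others: "category k" against all other categories.
p_k <- probs[, k]
y_k <- as.integer(y == k)

# 2. Cumulative (ordinal): "category k or lower" against "higher than k".
cum_probs <- t(apply(probs, 1, cumsum))
p_le_k <- cum_probs[, k]
y_le_k <- as.integer(y <= k)
```

Each pair (`p_k`, `y_k`) or (`p_le_k`, `y_le_k`) can then be assessed like any binary prediction.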

Roaches - PPC

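library("rstanarm")

# roaches ships with rstanarm. sqrt_roach1 is not a column of the raw data,
# so it is derived here (assumed to be the square root of roach1).
data(roaches)
roaches$sqrt_roach1 <- sqrt(roaches$roach1)

# Negative binomial regression with log exposure as an offset.
stan_glmnb <- stan_glm(
  y ~ sqrt_roach1 + treatment + senior,
  family = neg_binomial_2,
  offset = log(exposure2),
  prior = normal(0, 2.5),
  prior_intercept = normal(0, 5),
  data = roaches
)

# Default PPC: KDE overlay of the data and posterior predictive draws.
pp_check(stan_glmnb)

  • The default bandwidth is large for count data.
    • E.g. the KDE places a lot of mass between 0 and 1, where no counts can occur.
  • Considerable point mass at zero, also visible in the PIT ECDF.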
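
To make the bandwidth issue concrete without fitting the model, here is a rough sketch on simulated counts. The negative binomial draws are a hypothetical stand-in for the roaches data, not the data themselves. It measures how much KDE mass lands where a count variable cannot take values.

```r
set.seed(1)

# Hypothetical zero-heavy counts standing in for the roaches outcome.
y <- rnbinom(200, mu = 20, size = 0.3)

# Gaussian KDE with the default bandwidth, on a fine grid.
kde <- density(y, n = 4096)
dx <- diff(kde$x)[1]

# Share of observations that are exactly zero (the point mass).
prop_zero <- mean(y == 0)

# KDE mass on impossible values: below zero, and strictly between 0 and 1.
mass_below_zero <- sum(kde$y[kde$x < 0]) * dx
mass_between_0_and_1 <- sum(kde$y[kde$x > 0 & kde$x < 1]) * dx

# The bandwidth relative to the unit spacing of counts.
bandwidth <- kde$bw
```

A bandwidth well above 1 smears the spike at zero across neighbouring values, so the density plot cannot show the point mass.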

Roaches - PPC

# Reuses the stan_glmnb fit from the previous slide.
pp_check(stan_glmnb,
         bw = "sj",
         trim = TRUE)

  • The Sheather-Jones bandwidth selection algorithm often yields a more representative fit.
  • Still considerable point mass at zero.

Roaches - PPC

Rootogram

  • Emphasize discreteness of the predictive distribution

  • Our solution (bottom) returns the visualisation to an interval plot

    • More familiar to users
    • Posterior is also shown as discrete
    • Visually less busy without losing information
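
A sketch of the quantities behind such an interval display, assuming a matrix of predictive draws with one row per draw. The Poisson data and draws are hypothetical, and this is not the plotting code used for the figure. For every possible count value it computes the observed frequency and a predictive interval for that frequency.

```r
set.seed(2)
n <- 100   # observations
S <- 200   # predictive draws

# Hypothetical observations and predictive draws (S x n matrix).
y <- rpois(n, 3)
yrep <- matrix(rpois(S * n, 3), nrow = S)

# All count values that occur in the data or the draws.
values <- 0:max(y, yrep)

# Observed frequency of each value.
obs <- tabulate(y + 1, nbins = length(values))

# Frequency of each value within each predictive draw (S x length(values)).
rep_counts <- t(apply(yrep, 1, function(r) tabulate(r + 1, nbins = length(values))))

# Per-value predictive intervals for the frequency.
freq <- data.frame(
  value    = values,
  observed = obs,
  lower    = apply(rep_counts, 2, quantile, probs = 0.05),
  median   = apply(rep_counts, 2, median),
  upper    = apply(rep_counts, 2, quantile, probs = 0.95)
)
```

Rootograms conventionally show square roots of the frequencies; that transformation can be applied to the columns of `freq` at plotting time.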

Roaches

Summary statistics

  • Task-specific checks of quantities of interest.
    • For example, we can inspect the tail we cut out of the previous PPCs.
    • The tails of the posterior predictive distribution are expected to be heavier than the tail of the observations.
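
A minimal sketch of a tail-focused summary statistic check, assuming the maximum as the statistic of interest. The data and draws are simulated stand-ins; with a fitted rstanarm model the same check should be available through `pp_check(stan_glmnb, plotfun = "stat", stat = "max")`.

```r
set.seed(3)
n <- 150
S <- 100

# Hypothetical observations and predictive draws (S x n matrix).
y <- rnbinom(n, mu = 10, size = 0.5)
yrep <- matrix(rnbinom(S * n, mu = 10, size = 0.5), nrow = S)

# Statistic on the observed data and on each predictive draw.
stat_obs <- max(y)
stat_rep <- apply(yrep, 1, max)

# Share of draws whose maximum is at least as large as the observed one.
tail_prob <- mean(stat_rep >= stat_obs)
```

A histogram of `stat_rep` with `stat_obs` marked gives the usual visual form of this check.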

Roaches - Zero inflation

Binned calibration plots

  • Dimitriadis, Gneiting, and Jordan (2021) show how the choice of binning can lead to vastly different conclusions.
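
A small sketch of where the sensitivity comes from, assuming equal-width bins on [0, 1]. The function name and the simulated predictions are hypothetical. The same predictions are summarised with two bin counts, and the resulting calibration points differ.

```r
set.seed(4)
n <- 300

# Hypothetical predicted probabilities and binary outcomes.
p <- rbeta(n, 0.5, 0.5)
y <- rbinom(n, 1, p)

# Binned calibration: mean prediction and observed event frequency per bin.
binned_calibration <- function(p, y, n_bins) {
  bins <- cut(p, breaks = seq(0, 1, length.out = n_bins + 1),
              include.lowest = TRUE)
  data.frame(
    mean_pred = as.vector(tapply(p, bins, mean)),
    obs_freq  = as.vector(tapply(y, bins, mean)),
    n         = as.vector(table(bins))
  )
}

cal_5  <- binned_calibration(p, y, 5)
cal_10 <- binned_calibration(p, y, 10)
```

Bins with few observations give noisy `obs_freq` values, so changing `n_bins` can change the visual impression of calibration.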

Roaches - Zero inflation

PAV adjusted calibration plots

Use the pool-adjacent-violators (PAV) algorithm to replace binning in calibration plots with estimates of conditional event probabilities (CEP) (Dimitriadis, Gneiting, and Jordan 2021).

  • Estimate consistency bands from posterior samples.
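
A sketch of the idea using base R, where `isoreg()` performs the isotonic (PAV) fit. The predictions are simulated, the helper name is hypothetical, and the bands are obtained by drawing replicated outcomes under the assumption that the predicted probabilities are calibrated. That is an assumed stand-in for the posterior-sample-based bands mentioned above.

```r
set.seed(5)
n <- 500

# Hypothetical predicted probabilities and deliberately miscalibrated outcomes.
p <- runif(n)
y <- rbinom(n, 1, p^1.5)

# PAV-based CEP estimate: isotonic regression of outcomes on predictions.
pav_cep <- function(p, y) {
  o <- order(p)
  fit <- isoreg(p[o], y[o])
  data.frame(p = p[o], cep = fit$yf)
}

cal <- pav_cep(p, y)

# Consistency band: CEP curves for outcomes simulated as if p were calibrated.
B <- 200
sims <- replicate(B, pav_cep(p, rbinom(n, 1, p))$cep)  # n x B matrix
band <- t(apply(sims, 1, quantile, probs = c(0.05, 0.95)))
cal$lower <- band[, 1]
cal$upper <- band[, 2]
```

Plotting `cep` against `p` with the band and the diagonal gives the calibration plot; regions where `cep` leaves the band point to miscalibration.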

Roaches - Zero inflation

Residual plots

  • Discrete outcomes make direct inspection of residuals difficult.

  • Binned residual plots are a common solution.
    • They suffer from binning artefacts.
  • PAV adjusted CEPs offer a binning-free approach.
    • Point-wise consistency intervals for model calibration are obtained from posterior samples.

Conclusions

  • KDE density plots are effective summaries
    • Remember to assess goodness-of-fit to data.
  • Binomial predictions
    • PAV adjusted calibration plots offer a good default.
    • The PAV algorithm also allows for non-parametric residual plots.
  • Categorical predictions
    • One-vs-others \(\rightarrow\) binomial calibration
  • Ordinal predictions
    • Cumulative probability of N-or-less \(\rightarrow\) binomial calibration

Current stage

  • Targeting submission to Journal of Visualization and Interaction (JoVI)
    • Open access and open review.
    • Experimental track offers a chance to include interactivity.

Visual Predictive Checks

Prior predictive checks

Effect of hyperparameters in predictions of a GP by Gelman et al. (2013)

Prior elicitation. Exposure to air pollution by Gabry et al. (2019)

  • Check for conflicts between prior predictive distribution and domain knowledge.

References

Dimitriadis, Timo, Tilmann Gneiting, and Alexander I. Jordan. 2021. “Stable Reliability Diagrams for Probabilistic Classifiers.” Proceedings of the National Academy of Sciences 118 (8): e2016191118. https://doi.org/10.1073/pnas.2016191118.
Gabry, Jonah, Daniel Simpson, Aki Vehtari, Michael Betancourt, and Andrew Gelman. 2019. “Visualization in Bayesian Workflow.” Journal of the Royal Statistical Society: Series A (Statistics in Society) 182 (2): 389–402. https://doi.org/10.1111/rssa.12378.
Gelman, Andrew, John B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari, and Donald B. Rubin. 2013. Bayesian Data Analysis. Chapman and Hall/CRC.
Gelman, Andrew, Aki Vehtari, Daniel Simpson, Charles C. Margossian, Bob Carpenter, Yuling Yao, Lauren Kennedy, Jonah Gabry, Paul-Christian Bürkner, and Martin Modrák. 2020. “Bayesian Workflow.” arXiv:2011.01808 [Stat], November. http://arxiv.org/abs/2011.01808.